Yieldthought/llama31 8b/ttembed #12560

Merged
mtairum merged 3 commits into main from yieldthought/llama31_8b/ttembed on Sep 12, 2024

@yieldthought (Contributor) commented Sep 12, 2024

Ticket

Link to Github Issue

Problem description

Use on-device embeddings to reduce host-device communication

What's changed

  • On-device embeddings used for demo and demo_with_prefill
  • Demo perf improved to 17.7 t/s/u (tokens per second per user) with DISABLE_DI_DT_WORKAROUND=1, and 15.1 t/s/u otherwise
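
To illustrate why on-device embeddings reduce host-device communication, here is a rough sketch of the per-token transfer sizes. It is not the code from this PR: it uses NumPy as a stand-in for the device, the helper names are hypothetical, and the transfer formats (32-bit token IDs, bfloat16 activations) are assumptions. The hidden size of 4096 is that of Llama 3.1 8B.

```python
import numpy as np

HIDDEN_DIM = 4096    # hidden size of Llama 3.1 8B
BYTES_PER_ID = 4     # assumption: token IDs are sent as 32-bit integers
BYTES_PER_ELEM = 2   # assumption: embedding vectors are sent as bfloat16


def transfer_bytes_host_embedding(num_tokens: int) -> int:
    """Bytes sent to the device when the host does the lookup and ships vectors."""
    return num_tokens * HIDDEN_DIM * BYTES_PER_ELEM


def transfer_bytes_device_embedding(num_tokens: int) -> int:
    """Bytes sent to the device when only token IDs are shipped."""
    return num_tokens * BYTES_PER_ID


def embed(table: np.ndarray, token_ids: np.ndarray) -> np.ndarray:
    """An embedding lookup is a row gather from the embedding table."""
    return table[token_ids]


# Tiny illustrative table (16 tokens x 8 dims), not the real model weights.
rng = np.random.default_rng(0)
table = rng.standard_normal((16, 8)).astype(np.float32)
ids = np.array([3, 7, 3], dtype=np.int32)
vecs = embed(table, ids)

host_bytes = transfer_bytes_host_embedding(1)      # 8192 bytes per token
device_bytes = transfer_bytes_device_embedding(1)  # 4 bytes per token
```

Under these assumptions, each decoded token needs 4 bytes of host-to-device traffic instead of 8192, because the embedding table already resides on the device and only the ID crosses the link.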

Checklist

@yieldthought (Contributor, Author)

@sraizada-tt is working on a full integration with argmax and tracing, which will further improve perf. We can land this earlier, though, and it gives us the best perf so far.

@mtairum (Contributor) left a comment

Approving this now. I'm waiting for a test (~30 min) on the previous PR to finish so that it can be merged; then we can rebase this one and merge it afterwards.

@mtairum (Contributor) commented Sep 12, 2024

The other PR is now merged. I just rebased this one and kicked off the demo pipeline. The other pipelines are not needed, since this PR doesn't touch them.

https://github.com/tenstorrent/tt-metal/actions/runs/10829713986

@mtairum force-pushed the yieldthought/llama31_8b/ttembed branch from f8e9156 to a93ea76 on September 12, 2024 12:36
@mtairum merged commit 9014138 into main on Sep 12, 2024
6 checks passed
@mtairum deleted the yieldthought/llama31_8b/ttembed branch on September 12, 2024 12:39